ART: Robustness of Meshes and Tori for Parallel and Distributed Computation
Authors
Abstract
In this paper, we formulate the array robustness theorems (ARTs) for efficient computation and communication on faulty arrays. No hardware redundancy is required, and no assumption is made about the availability of a complete submesh or subtorus. Based on ARTs, a very wide variety of problems, including sorting, FFT, total exchange, permutation, and some matrix operations, can be solved with a slowdown factor of 1 + o(1). The number of faults tolerated by ARTs ranges from o(min(n^(1-1/d), n/d, n/h)) for n-ary d-cubes with worst-case faults to as large as o(N) for most N-node 2-D meshes or tori with random faults, where h is the number of data items per processor. The resultant running times are the best results reported thus far for solving many problems on faulty arrays. Based on ARTs and several other components such as robust libraries, the priority emulation discipline, and X Y 0 routing, we introduce the robust adaptation interface layer (RAIL) as middleware between ordinary algorithms/programs (originally developed for fault-free arrays) and the faulty network/hardware. In effect, RAIL provides a virtual fault-free network to higher layers, while ordinary algorithms/programs are transformed through RAIL into corresponding robust algorithms/programs that can run on faulty networks.
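To give a feel for the scale of the worst-case fault bound, the following is a minimal Python sketch, not the authors' method. It assumes the bound in the abstract reads o(min(n^(1-1/d), n/d, n/h)) and evaluates only the quantity inside the little-o. The function name fault_envelope and the sample parameter values are hypothetical, chosen for illustration.

```python
def fault_envelope(n: int, d: int, h: int) -> float:
    """Evaluate min(n^(1-1/d), n/d, n/h) for an n-ary d-cube.

    Assumption: this is the quantity inside the little-o bound quoted in
    the abstract. The tolerated number of worst-case faults grows strictly
    slower than this value; it is an envelope, not an exact count.

    n: nodes per dimension, d: number of dimensions,
    h: data items per processor.
    """
    if n < 2 or d < 1 or h < 1:
        raise ValueError("expected n >= 2, d >= 1, h >= 1")
    return min(n ** (1 - 1 / d), n / d, n / h)


if __name__ == "__main__":
    # Hypothetical sample configurations (n, d, h).
    for n, d, h in [(1024, 2, 1), (1024, 2, 64), (64, 3, 1)]:
        print(f"n={n}, d={d}, h={h}: envelope = {fault_envelope(n, d, h):.2f}")
```

With few items per processor, the n^(1-1/d) term is the smallest of the three for these samples; once h exceeds n^(1/d), the n/h term takes over, so heavier per-processor loads tighten the envelope.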
Similar resources
Fault-Tolerant Meshes and Tori Embedded in a Faulty Supercube
Hypercubes, meshes, and tori are well known interconnection networks for parallel computing. The Supercube network is a generalization of the hypercube. The main advantage of this network is that it has the same connectivity and diameter as that of the hypercube without the constraint that the number of nodes be a power of 2. This paper proposes novel algorithms of fault-tolerant meshes and tor...
Honeycomb Networks: Topological Properties and Communication Algorithms
The honeycomb mesh, based on hexagonal plane tessellation, is considered as a multiprocessor interconnection network. A honeycomb mesh network with n nodes has degree 3 and diameter approximately 1.63√n − 1, which is a 25 percent smaller degree and an 18.5 percent smaller diameter than the mesh-connected computer with approximately the same number of nodes. Vertex and edge symmetric honeycomb torus network is obt...
A Message-Passing Distributed Memory Parallel Algorithm for a Dual-Code Thin Layer, Parabolized Navier-Stokes Solver
In this study, the results of parallelization of a 3-D dual code (Thin Layer, Parabolized Navier-Stokes solver) for solving supersonic turbulent flow around body and wing-body combinations are presented. As a serial code, TLNS solver is very time consuming and takes a large part of memory due to the iterative and lengthy computations. Also for complicated geometries, an exceeding number of grid...
On Fault-Tolerant Embedding of Meshes and Tori in a Flexible Hypercube with Unbounded Expansion
Flexible Hypercubes are superior to hypercubes for embedding meshes and tori under faults. This paper therefore presents techniques to enhance the novel algorithm for fault-tolerant meshes and tori embedded in Flexible Hypercubes with node failures. The paper demonstrates that O(n^2 - ⌊log2 m⌋^2) faults can be tolerated and the algorithm is optimized mainly for balancing the proce...
A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure
The IP lookup process is a key bottleneck in routing due to the increase in routing table size, increasing traffic, and migration to IPv6 addresses. IP address lookup involves computation of the Longest Prefix Match (LPM), for which existing solutions, such as BSD Radix Tries, scale poorly when traffic in the router increases or when employed for IPv6 address lookups. In this paper, we describe a ...